Multiterminal Source Coding with an 
Entropy-Based Distortion Measure 



Thomas A. Courtade and Richard D. Wesel 
Department of Electrical Engineering 
University of California, Los Angeles 
Los Angeles, California 90095 

Email: tacourta@ee.ucla.edu; wesel@ee.ucla.edu 



Abstract — In this paper, we consider a class of multiterminal 
source coding problems, each subject to distortion constraints 
computed using a specific, entropy-based, distortion measure. 
We provide the achievable rate distortion region for two cases 
and, in so doing, we demonstrate a relationship between the 
lossy multiterminal source coding problems with our specific 
distortion measure and (1) the canonical Slepian-Wolf lossless 
distributed source coding network, and (2) the Ahlswede-Korner- 
Wyner source coding with side information problem in which 
only one of the sources is recovered losslessly. 
Fig. 1. Classical multiterminal source coding network.



I. Introduction 



A. Background 



A complete characterization of the achievable rate distortion
region for the classical lossy multiterminal source coding
problem depicted in Fig. 1 has remained an open problem for
over three decades. Several special cases have been solved:

• The lossless case where $D_x = 0$, $D_y = 0$. Slepian and
Wolf solved this case in their seminal work [1].

• The case where one source is recovered losslessly,
i.e., $D_x = 0$, $D_y = D_{\max}$. This case corresponds
to the source coding with side information problem of
Ahlswede-Korner-Wyner [2], [3].

• The Wyner-Ziv case [4], where $Y^n$ is available to the
decoder as side information and $X^n$ should be recovered
with distortion at most $D_x$.

• The Berger-Yeung case [5] (which subsumes the previous
three cases), where $D_x$ is arbitrary and $D_y = 0$.

Despite the apparent progress, other seemingly fundamental
cases, such as when $D_x$ is arbitrary and $D_y = D_{\max}$, remain
unsolved except perhaps in very special cases.

B. Our Contribution 

In this paper, we give the achievable rate region for two
cases subject to a particular choice of distortion measure $d(\cdot)$,
defined in Section II. Specifically, for our particular choice
of $d(\cdot)$, we give the achievable rate distortion region for the
following two cases:

• The situation when $X$ and $Y$ are subject to a joint
distortion constraint given a reproduction $Z$:

$$E[d(X, Y, Z)] \le D.$$

• The case where $X$ is subject to a distortion constraint
given a reproduction $V$:

$$E[d(X, V)] \le D_x,$$

and there is no distortion constraint on the reproduction
of $Y$ (i.e., $D_y = D_{\max}$).

The regions depend critically on our choice of $d(\cdot)$, which
can be interpreted as a natural measure of the soft information
the reproduction symbol $Z$ provides about the source symbols
$X$ and $Y$ (resp. the information $V$ provides about $X$).

The remainder of this paper is organized as follows. In
Section II we formally define the problem and provide our
main results. In Section III we discuss the properties of
$d(\cdot)$ and provide the proofs of our main results. Section
IV delivers the conclusions and a brief discussion regarding
further directions.

II. Problem Statement and Results 

In this paper, we consider two cases of the lossy multiterminal
source coding network presented in Fig. 2.

In the first case, we study the achievable rates $(R_x, R_y)$
subject to the joint distortion constraint

$$E[d(X, Y, Z)] \le D,$$

where $Z$ is the joint reproduction symbol computed at the
decoder from the messages $f_x$ and $f_y$ received from the X-
and Y-encoders respectively.

In the second case, we study the achievable rates $(R_x, R_y)$
subject to a distortion constraint on $X$:

$$E[d(X, V)] \le D_x,$$

where $V$ is the reproduction symbol computed at the decoder
from the messages $f_x$ and $f_y$ received from the X- and
Y-encoders respectively. In this second case, there is no distortion
constraint on $Y$.

Definition 1: To simplify terminology, we refer to the first 
and second cases described above as the Joint Distortion (JD) 
network and X-Distortion (XD) network respectively. 



[Figure: the decoder outputs either $Z$ with $E[d(X,Y,Z)] \le D$ (JD
network) or $V$ with $E[d(X,V)] \le D_x$ (XD network).]

Fig. 2. The Joint Distortion (JD) and X-Distortion (XD) networks.

Formally, define the source alphabets as $\mathcal{X} = \{1, 2, \dots, m\}$
and $\mathcal{Y} = \{1, 2, \dots, \ell\}$. We consider the discrete memoryless
source sequences $X^n$ and $Y^n$ drawn i.i.d. according to the
joint distribution $p(x,y)$. Let $X^n$ be available at the
X-encoder and $Y^n$ be available at the Y-encoder as depicted in
Fig. 2. (We will informally refer to probability mass functions
as distributions throughout this paper.)

For the case of joint distortion, we consider the reproduction
alphabet $\mathcal{Z} = \Delta_{m \times \ell}$, where $\Delta_k$ denotes the set of probability
distributions on $k$ points. In other words, for $z \in \mathcal{Z}$,
$z = (q_{1,1}, \dots, q_{m,\ell})$ where $q_{i,j} \ge 0$ and $\sum_{i,j} q_{i,j} = 1$. With $z$
defined in this way, it will be convenient to use the notation
$z(x,y) = q_{x,y}$ for $x \in \mathcal{X}$, $y \in \mathcal{Y}$. Note that the restriction
of the reproduction alphabet to the probability simplex places
constraints on the function $z(x,y)$. For example, one cannot
choose $z(x,y) = x + y$.

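Define the joint distortion measure $d : \mathcal{X} \times \mathcal{Y} \times \mathcal{Z} \to \mathbb{R}^+$
by

$$d(x,y,z) = \log\left(\frac{1}{z(x,y)}\right) \qquad (1)$$

and the corresponding distortion between the sequences
$(x^n, y^n)$ and $z^n$ as

$$d(x^n,y^n,z^n) = \frac{1}{n}\sum_{i=1}^{n} \log\left(\frac{1}{z_i(x_i,y_i)}\right). \qquad (2)$$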



As we will see in Section III, the distortion measure $d(\cdot)$
measures the amount of soft information that the reproduction
symbols provide about the source symbols in such a way that
the expected distortion can be described as an entropy. For
example, given the output from a discrete memoryless channel,
the minimum expected distortion between the channel input and
a reproduction computed from the channel output is the conditional
entropy of the input given the output. For this reason, we refer
to $d(\cdot)$ as an entropy-based distortion measure.
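The definitions in (1) and (2) can be illustrated with a small numerical sketch. The code below is not from the paper: the function names and the example pmf `p` are hypothetical, symbols are indexed from 0 rather than 1, and logarithms are taken base 2 so that distortion is measured in bits. It shows that the expected distortion equals the entropy when the reproduction is the true pmf, and is larger for a mismatched reproduction.

```python
import math

def d(x, y, z):
    """Symbol distortion (1): log2(1 / z(x, y)), with z a pmf on X x Y (nested list)."""
    return -math.log2(z[x][y])

def d_seq(xs, ys, zs):
    """Sequence distortion (2): average of the per-symbol distortions."""
    return sum(d(x, y, z) for x, y, z in zip(xs, ys, zs)) / len(xs)

def expected_d(p, z):
    """E[d(X, Y, z)] when (X, Y) is distributed according to the joint pmf p."""
    return sum(p[x][y] * d(x, y, z)
               for x in range(len(p))
               for y in range(len(p[0]))
               if p[x][y] > 0)

def entropy(p):
    """Joint entropy H(X, Y) in bits."""
    return -sum(q * math.log2(q) for row in p for q in row if q > 0)

# Hypothetical joint pmf on a 2 x 2 alphabet (assumption for illustration).
p = [[0.5, 0.25], [0.125, 0.125]]
uniform = [[0.25, 0.25], [0.25, 0.25]]

matched = expected_d(p, p)           # reproduction equals the true pmf
mismatched = expected_d(p, uniform)  # reproduction ignores the source statistics
```

Here `matched` coincides with `entropy(p)`, while `mismatched` exceeds it by the relative entropy between `p` and `uniform`.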

The function $d(\cdot)$ is a natural distortion measure for practical
scenarios. A similar distortion measure has appeared previously
in the image processing literature [6] and in the study
of the information bottleneck problem [7]. However, it does
not appear to have been studied in the context of multiterminal
source coding.

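A $(2^{nR_x}, 2^{nR_y}, n)$-rate distortion code for the JD network
consists of encoding functions,

$$f_x : \mathcal{X}^n \to \{1, 2, \dots, 2^{nR_x}\}$$
$$f_y : \mathcal{Y}^n \to \{1, 2, \dots, 2^{nR_y}\}$$

and a decoding function

$$g : \{1, 2, \dots, 2^{nR_x}\} \times \{1, 2, \dots, 2^{nR_y}\} \to \mathcal{Z}^n.$$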



A vector $(R_x, R_y, D)$ with nonnegative components is
achievable for the JD network if there exists a sequence of
$(2^{nR_x}, 2^{nR_y}, n)$-rate distortion codes satisfying

$$\lim_{n \to \infty} E\left[d(X^n, Y^n, g(f_x(X^n), f_y(Y^n)))\right] \le D.$$

Definition 2: The achievable rate distortion region, $\mathcal{R}$, for
the JD network is the closure of the set of all achievable
vectors $(R_x, R_y, D)$.

In a similar manner, we can also consider the case when
there is only a distortion constraint on $X$ rather than a
joint distortion constraint on $X, Y$. For this, we consider the
reproduction alphabet $\mathcal{V} = \Delta_m$. With $v$ defined in this way, it
will be convenient to use the notation $v(x) = q_x$ for $x \in \mathcal{X}$.

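We define the distortion measure $d_x : \mathcal{X} \times \mathcal{V} \to \mathbb{R}^+$ by

$$d_x(x,v) = \log\left(\frac{1}{v(x)}\right) \qquad (3)$$

and the corresponding distortion between the sequences $x^n$
and $v^n$ as

$$d_x(x^n,v^n) = \frac{1}{n}\sum_{i=1}^{n} \log\left(\frac{1}{v_i(x_i)}\right). \qquad (4)$$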



Identical to the case for the JD network, we can define a
$(2^{nR_x}, 2^{nR_y}, n)$-rate distortion code for the XD network, with
the exception that the range of the decoding function $g(\cdot)$ is
$\mathcal{V}^n$, i.e., sequences over the reproduction alphabet $\mathcal{V}$.

A vector $(R_x, R_y, D_x)$ with nonnegative components is
achievable for the XD network if there exists a sequence of
$(2^{nR_x}, 2^{nR_y}, n)$-rate distortion codes satisfying

$$\lim_{n \to \infty} E\left[d_x(X^n, g(f_x(X^n), f_y(Y^n)))\right] \le D_x.$$

Definition 3: The achievable rate distortion region, $\mathcal{R}_x$, for
the XD network is the closure of the set of all achievable
vectors $(R_x, R_y, D_x)$.

Our main results are stated in the following theorems: 

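Theorem 1:

$$\mathcal{R} = \left\{ (R_x, R_y, D) :
\begin{array}{l}
\exists\, \delta_x, \delta_y \ge 0 \text{ such that} \\
D \ge \delta_x + \delta_y \\
R_x + \delta_x \ge H(X|Y) \\
R_y + \delta_y \ge H(Y|X) \\
R_x + R_y + D \ge H(X,Y)
\end{array} \right\}$$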



Theorem 2:

$$\mathcal{R}_x = \left\{ (R_x, R_y, D_x) :
\begin{array}{l}
R_x + D_x \ge H(X|U) \\
R_y \ge I(Y;U)
\end{array} \right\}$$

for some distribution $p(x,y,u) = p(x,y)\,p(u|y)$,
where $|\mathcal{U}| \le |\mathcal{Y}| + 2$.

Since the distortion measure is reminiscent of discrete
entropy, we can think of the units of distortion as "bits" of
distortion. Thus, Theorem 1 states that for every bit of distortion
we allow for $X, Y$ jointly, we can remove exactly one
bit of required rate from the constraints defining the Slepian-Wolf
achievable rate region. Indeed, we prove the theorem by
demonstrating a correspondence between a modified Slepian-Wolf
network and the multiterminal source coding problem in
question.
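The region in Theorem 1 can be tested numerically for a given source. The following is a minimal sketch, not part of the paper: the function names and the example pmf are hypothetical. It uses the observation that the smallest admissible slack values are $\delta_x = \max\{0, H(X|Y) - R_x\}$ and $\delta_y = \max\{0, H(Y|X) - R_y\}$, so membership reduces to two inequalities.

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def entropies(p):
    """Return (H(X,Y), H(X|Y), H(Y|X)) for a joint pmf p[x][y]."""
    hxy = H([q for row in p for q in row])
    hx = H([sum(row) for row in p])
    hy = H([sum(col) for col in zip(*p)])
    return hxy, hxy - hy, hxy - hx

def in_jd_region(p, rx, ry, D, tol=1e-12):
    """Check whether (rx, ry, D) satisfies the constraints of Theorem 1."""
    hxy, hx_given_y, hy_given_x = entropies(p)
    # Smallest slacks that satisfy the two conditional-entropy constraints.
    dx = max(0.0, hx_given_y - rx)
    dy = max(0.0, hy_given_x - ry)
    return dx + dy <= D + tol and rx + ry + D >= hxy - tol

# Hypothetical source with H(X,Y) = 1.5, H(X) = 1, H(Y|X) = 0.5 bits.
p = [[0.5, 0.0], [0.25, 0.25]]
```

For this source, `in_jd_region(p, 1.0, 0.5, 0.0)` recovers a Slepian-Wolf corner point, and allowing `D = 0.5` bits of distortion admits the lower rate pair `(0.5, 0.5)`.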

Similarly, when we only consider a distortion constraint on
$X$, Theorem 2 states that for every bit of distortion we tolerate,
we can remove one bit of rate required by the X-encoder in
the Ahlswede-Korner-Wyner region.
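As an illustration of Theorem 2, the sketch below evaluates $H(X|U)$ and $I(Y;U)$ for one fixed test channel $p(u|y)$ and checks the two inequalities. It is only a partial check under stated assumptions: the region is a union over all admissible $p(u|y)$, whereas this code examines a single choice, and all names and example distributions are hypothetical.

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def xd_quantities(pxy, pu_y):
    """Return (H(X|U), I(Y;U)) under p(x,y,u) = pxy[x][y] * pu_y[y][u]."""
    nx, ny, nu = len(pxy), len(pxy[0]), len(pu_y[0])
    py = [sum(pxy[x][y] for x in range(nx)) for y in range(ny)]
    pu = [sum(py[y] * pu_y[y][u] for y in range(ny)) for u in range(nu)]
    pxu = [sum(pxy[x][y] * pu_y[y][u] for y in range(ny))
           for x in range(nx) for u in range(nu)]
    pyu = [py[y] * pu_y[y][u] for y in range(ny) for u in range(nu)]
    h_x_given_u = H(pxu) - H(pu)
    i_yu = H(py) + H(pu) - H(pyu)
    return h_x_given_u, i_yu

def satisfies_xd(pxy, pu_y, rx, ry, dx, tol=1e-12):
    """Check the Theorem 2 inequalities for this particular p(u|y)."""
    h_x_given_u, i_yu = xd_quantities(pxy, pu_y)
    return rx + dx >= h_x_given_u - tol and ry >= i_yu - tol

# Hypothetical example: X = Y is a uniform bit.
pxy = [[0.5, 0.0], [0.0, 0.5]]
u_equals_y = [[1.0, 0.0], [0.0, 1.0]]  # U reveals Y completely
u_constant = [[1.0], [1.0]]            # U carries no information
```

With `u_equals_y` the Y-encoder spends one bit and the X-encoder needs nothing; with `u_constant` the X-encoder must cover $H(X|U) = 1$ bit through rate, distortion, or a mix of both.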

The proofs of Theorems 1 and 2 are given in the next
section.

III. Proofs 

We choose to prove Theorems 1 and 2 by showing a
correspondence between schemes that achieve a prescribed
distortion constraint and the well-known lossless distributed
source coding scheme of Slepian and Wolf, and the source
coding with side-information scheme of Ahlswede, Korner,
and Wyner. This provides a great deal of insight into how the
various distortions are achieved.

In each case, the proof relies on a peculiar property of
the distortion measure $d(\cdot)$, namely the ability to convert
expected distortions to entropies that are easily manipulated.
In the following subsection, we discuss the properties of the
distortion measure $d(\cdot)$.

A. Properties of $d(\cdot)$

As stated above, one particularly useful property of $d(\cdot)$
is the ability to convert expected distortions to conditional
entropies. This is stated formally in the following lemma.

Lemma 1: Given any $U$ arbitrarily correlated with
$(X^n, Y^n)$, the estimator $Z^n[U]$ produces the expected
distortion

$$E\left[d(X^n,Y^n,Z^n)\right] \ge \frac{1}{n}\sum_{i=1}^{n} H(X_i,Y_i|U).$$

Moreover, this lower bound can be achieved by setting
$Z_i[u](x,y) := \Pr(X_i = x, Y_i = y \,|\, U = u)$.

Proof: Given any $U$ arbitrarily correlated with $(X^n, Y^n)$,
denote the reproduction of $(X^n, Y^n)$ from $U$ as $Z^n[U] \in \mathcal{Z}^n$.
By definition of the reproduction alphabet, we can consider the
estimator $Z^n[U]$ to be some probability distribution on $\mathcal{X} \times \mathcal{Y}$
conditioned on $U$. Then, we obtain the following lower bound
on the expected distortion conditioned on $U = u$:

$$\begin{aligned}
E\left[d(X^n,Y^n,Z^n) \,|\, U = u\right]
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{x,y} p_i(x,y|u)\log\left(\frac{1}{z_i[u](x,y)}\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}\Big[ D\big(p_i(x,y|u)\,\|\,z_i[u](x,y)\big) + H(X_i,Y_i|U=u) \Big] \\
&\ge \frac{1}{n}\sum_{i=1}^{n} H(X_i,Y_i|U=u),
\end{aligned}$$

where $p_i(x,y|u) = \Pr(X_i = x, Y_i = y \,|\, U = u)$ is the true
conditional distribution. Averaging both sides over all values
of $U$, we obtain the desired result. Note that the lower bound
can always be achieved by setting $z_i[u](x,y) := p_i(x,y|u)$.
■
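The key step in the proof above is the identity "cross-entropy equals relative entropy plus entropy". A small numerical sketch, with hypothetical pmfs `p` (true distribution) and `z` (reproduction) flattened into lists, is given below; the function names are ours.

```python
import math

def cross_entropy(p, z):
    """sum_k p_k * log2(1 / z_k): expected distortion of reproduction z under truth p."""
    return -sum(pk * math.log2(zk) for pk, zk in zip(p, z) if pk > 0)

def kl(p, z):
    """Relative entropy D(p || z) in bits."""
    return sum(pk * math.log2(pk / zk) for pk, zk in zip(p, z) if pk > 0)

def entropy(p):
    """Entropy of p in bits."""
    return -sum(pk * math.log2(pk) for pk in p if pk > 0)

# Hypothetical true pmf and mismatched reproduction on three points.
p = [0.5, 0.25, 0.25]
z = [0.25, 0.25, 0.5]

lhs = cross_entropy(p, z)     # expected distortion
rhs = kl(p, z) + entropy(p)   # decomposition used in the proof
```

Because the relative entropy is nonnegative and vanishes only when `z` equals `p`, the expected distortion is minimized, and equals the entropy, exactly when the reproduction is the true conditional distribution.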

We now give two examples which illustrate the utility of
the property stated in Lemma 1.

Example 1: Consider the following theorem of Wyner and
Ziv [4]:

Theorem 3: Let $(X, Y)$ be drawn i.i.d. and let $d(x, z)$ be
given. The rate distortion function with side information is

$$R_Y(D) = \min_{p(w|x)} \min_{f} I(X;W|Y),$$

where the minimization is over all functions $f : \mathcal{Y} \times \mathcal{W} \to \mathcal{Z}$
and conditional distributions $p(w|x)$ such that
$E[d(X, f(Y, W))] \le D$.

For an arbitrary distortion measure, $R_Y(D)$ can be difficult to
compute. In light of Lemma 1 and its proof, we immediately
see that, for our choice of $d(\cdot)$:

$$R_Y(D) = H(X|Y) - D.$$

Example 2: As a corollary to the previous example, taking
$Y = \emptyset$ (i.e., no side information), we obtain the standard rate
distortion function for a source $X^n$:

$$R(D) = H(X) - D.$$
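The step from Lemma 1 to the formula in Example 1 can be spelled out as follows. This is a sketch of one possible argument, assuming $0 \le D \le H(X|Y)$; the erasure-style test channel used for achievability is our illustrative choice and is not taken from the paper.

```latex
\begin{align*}
\min_{f} E\left[d(X, f(Y,W))\right] &= H(X|Y,W)
  && \text{(Lemma 1 with } n = 1,\ U = (Y,W)\text{)} \\
I(X;W|Y) &= H(X|Y) - H(X|Y,W) \\
         &\ge H(X|Y) - D
  && \text{(distortion constraint } H(X|Y,W) \le D\text{)}.
\end{align*}
% Achievability (assumed construction): let W = X with probability
% 1 - D / H(X|Y) and W = e (an erasure) otherwise, independently of (X, Y).
% Then H(X|Y,W) = (D / H(X|Y)) * H(X|Y) = D, so the bound holds with equality.
```

Taking $Y$ to be trivial in the same argument gives the expression in Example 2.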

In both examples, we make the surprising observation that
the distortion function $d(\cdot)$ yields a rate distortion function
that is a multiple of the rate distortion function obtained using
the "erasure" distortion measure $d^{\infty}(\cdot)$ defined as follows:

$$d^{\infty}(x,z) = \begin{cases}
0 & \text{if } z = x \\
\infty & \text{if } z \ne x \text{ and } z \ne e \\
1 & \text{if } z = e.
\end{cases} \qquad (5)$$

This is somewhat counter-intuitive given the fact that an
estimator is able to pass much more "soft" information to
the distortion measure $d(\cdot)$ compared to $d^{\infty}(\cdot)$. It would be
interesting to understand whether or not this relationship holds
for general multiterminal networks; however, this question remains
open.

Definition 4: We have defined $d(\cdot)$ to be a joint distortion
measure on $\mathcal{X} \times \mathcal{Y}$; however, it is possible to decompose it in
a natural way. We can define the marginal and conditional
distortions for $X$ and $Y|X$ respectively by decomposing
$z_i[u](x,y) = z_i(x|u)\, z_i(y|x,u)$ (note the slight abuse of
notation). Thus, if the total expected distortion is at most $D$,
we define the marginal and conditional distortions $D_x$ and
$D_{y|x}$ as follows:

$$D \ge E\left[d(X^n,Y^n,Z^n)\right]
   = E\left[d_x(X^n,Z^n)\right] + E\left[d_{y|x}(Y^n,Z^n)\right]
   = D_x + D_{y|x},$$

where, applying the factorization above to (2),
$d_x(x^n,z^n) = \frac{1}{n}\sum_{i=1}^{n} \log\frac{1}{z_i(x_i|u)}$ and
$d_{y|x}(y^n,z^n) = \frac{1}{n}\sum_{i=1}^{n} \log\frac{1}{z_i(y_i|x_i,u)}$.

In a complementary manner, we can decompose the expected
distortion into $D_y$, $D_{x|y}$ satisfying $D \ge D_y + D_{x|y}$.

The definitions of expected total, marginal, and conditional 
distortion allow us to bound the number of sequences that 
are "distortion-typical". First, we require a result on peak 
distortion. 

Lemma 2: Suppose we have a sequence of $(2^{nR_x}, 2^{nR_y}, n)$-rate
distortion codes satisfying

$$\lim_{n \to \infty} E\left[d(X^n, Y^n, g(f_x(X^n), f_y(Y^n)))\right] \le D.$$

For any $\epsilon > 0$, $\Pr\{d(X^n, Y^n, Z^n) > D + \epsilon\} < \epsilon$ for a
sufficiently large blocklength $n$.

Proof: Suppose a length-$n$ code satisfies the expected
distortion constraint $E[d(X^n,Y^n,Z^n)] < D + \epsilon/2$. By
repeating the code $N$ times, we obtain $N$ i.i.d. realizations of
$(X^n,Y^n,Z^n) \sim p(X^n,Y^n,Z^n)$. By the weak law of large
numbers:

$$\Pr\left\{d(X^{Nn},Y^{Nn},Z^{Nn}) > D + \epsilon\right\} < \epsilon$$

for $N$ sufficiently large. ■
Now, we take a closer look at the sets of source sequences
that produce a given distortion.

Lemma 3: Let $A(z^n) = \{(x^n,y^n) : d(x^n,y^n,z^n) \le D + \epsilon\}$
for some $\epsilon > 0$. The size of $A(z^n)$ is bounded from above by

$$|A(z^n)| \le 2^{n(D+2\epsilon)}.$$

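Proof: For each $(x^n, y^n) \in A(z^n)$, we can rearrange (2)
to obtain

$$1 \le 2^{n(D+\epsilon)} \prod_{i=1}^{n} z_i(x_i,y_i). \qquad (6)$$

By the definition of $z^n$, observe that $\prod_{i=1}^{n} z_i(x_i, y_i)$ is a valid
probability measure on $\mathcal{X}^n \times \mathcal{Y}^n$. Thus, for any subset
$S \subseteq \mathcal{X}^n \times \mathcal{Y}^n$, we have

$$\sum_{(x^n,y^n) \in S} \prod_{i=1}^{n} z_i(x_i,y_i) \le 1. \qquad (7)$$

Combining (6) and (7) gives the desired result:

$$|A(z^n)| = \sum_{(x^n,y^n) \in A(z^n)} 1
 \le \sum_{(x^n,y^n) \in A(z^n)} 2^{n(D+\epsilon)} \prod_{i=1}^{n} z_i(x_i,y_i)
 \le 2^{n(D+\epsilon)} \le 2^{n(D+2\epsilon)}.$$

■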
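The counting bound of Lemma 3 can be checked by brute force for a tiny blocklength. The sketch below is an illustration only: the alphabet sizes, blocklength, threshold, and random reproduction pmfs are arbitrary assumptions, and the helper names are ours.

```python
import itertools
import math
import random

def seq_distortion(pairs, zs):
    """Sequence distortion (2) for a list of (x, y) pairs and reproduction pmfs zs."""
    return sum(-math.log2(z[x][y]) for (x, y), z in zip(pairs, zs)) / len(pairs)

def count_A(zs, m, l, threshold):
    """Number of (x^n, y^n) whose distortion with respect to zs is at most threshold."""
    symbols = list(itertools.product(range(m), range(l)))
    return sum(1 for pairs in itertools.product(symbols, repeat=len(zs))
               if seq_distortion(pairs, zs) <= threshold)

def random_pmf(m, l, rng):
    """A strictly positive random pmf on an m x l alphabet."""
    w = [[rng.random() + 1e-3 for _ in range(l)] for _ in range(m)]
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]

rng = random.Random(0)
n, m, l = 4, 2, 2                      # assumed toy parameters
zs = [random_pmf(m, l, rng) for _ in range(n)]
D, eps = 1.0, 0.1
size = count_A(zs, m, l, D + eps)
bound = 2 ** (n * (D + eps))           # bound from combining (6) and (7)
```

When every $z_i$ is uniform on the four symbol pairs, each of the $4^n$ sequences has distortion exactly 2 bits, so the bound is met with equality at threshold 2.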

We can also modify the previous result to include sequences 
which satisfy marginal and conditional distortion constraints. 

Lemma 4: Let $A_x(z^n) = \{x^n : d_x(x^n,z^n) \le D_x + \epsilon\}$ and
$A_{y|x}(z^n) = \{y^n : d_{y|x}(y^n, z^n) \le D_{y|x} + \epsilon\}$ for some $\epsilon > 0$.
The sizes of these sets are bounded as follows:

$$|A_x(z^n)| \le 2^{n(D_x+2\epsilon)}, \text{ and}$$
$$|A_{y|x}(z^n)| \le 2^{n(D_{y|x}+2\epsilon)}$$

for sufficiently large $n$. Symmetric statements hold for $A_y(z^n)$
and $A_{x|y}(z^n)$.

Proof: The proof is nearly identical to that of Lemma 3
and is therefore omitted. ■

B. Proof of Theorem 1

As mentioned previously, we prove Theorem 1 by demonstrating
a correspondence between the JD network with a joint
distortion constraint and a Slepian-Wolf network. To this end,
we now define a modified Slepian-Wolf code. Essentially, the
code splits the rates of each user into two parts. We refer
to this network as the Split-Message Slepian-Wolf (SMSW)
network.

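A $(2^{nR_x}, 2^{nR_y}, 2^{n\delta_1}, 2^{n\delta_2}, n)$-SW (Slepian-Wolf) code
for the SMSW network consists of encoding functions,

$$\phi_x : \mathcal{X}^n \to \{1, 2, \dots, 2^{nR_x}\}$$
$$\phi_y : \mathcal{Y}^n \to \{1, 2, \dots, 2^{nR_y}\}$$
$$\psi_x : \mathcal{X}^n \to \{1, 2, \dots, 2^{n\delta_1}\}$$
$$\psi_y : \mathcal{Y}^n \to \{1, 2, \dots, 2^{n\delta_2}\}$$

and a decoding function

$$\chi : [2^{nR_x}] \times [2^{nR_y}] \times [2^{n\delta_1}] \times [2^{n\delta_2}] \to \mathcal{X}^n \times \mathcal{Y}^n.$$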

A vector $(R_x, R_y, \delta_1, \delta_2)$ with nonnegative components is
achievable for the SMSW network if there exists a sequence
of $(2^{nR_x}, 2^{nR_y}, 2^{n\delta_1}, 2^{n\delta_2}, n)$-SW codes satisfying

$$\lim_{n \to \infty} \Pr\left\{(X^n,Y^n) \ne \chi(\phi_x, \phi_y, \psi_x, \psi_y)\right\} = 0.$$

Definition 5: The achievable region, $\mathcal{R}_{SW}$, for the SMSW
network is the closure of the set of all achievable vectors
$(R_x, R_y, \delta_1, \delta_2)$.

Theorem 4 ([1]): The achievable rate region $\mathcal{R}_{SW}$ consists
of all rate tuples $(R_x, R_y, \delta_1, \delta_2)$ satisfying

$$R_x + \delta_1 \ge H(X|Y)$$
$$R_y + \delta_2 \ge H(Y|X)$$
$$R_x + R_y + \delta_1 + \delta_2 \ge H(X,Y).$$



Claim 1: If $(R_x, R_y, D)$ is an achievable rate-distortion
vector for the JD network, then $(R_x, R_y, \delta_1, \delta_2)$ is an achievable
rate vector for the SMSW network for some $\delta_1, \delta_2 \ge 0$
such that $\delta_1 + \delta_2 \le D$.

Proof: Suppose we have a sequence of $(2^{nR_x}, 2^{nR_y}, n)$-rate
distortion codes satisfying

$$\lim_{n \to \infty} E\left[d(X^n, Y^n, g(f_x(X^n), f_y(Y^n)))\right] \le D.$$

From these codes, we will construct a sequence of
$(2^{nR_x}, 2^{nR_y}, 2^{n\delta_1^{(n)}}, 2^{n\delta_2^{(n)}}, n)$-SW codes satisfying
$\lim_{n \to \infty} \delta_1^{(n)} + \delta_2^{(n)} \le D$ and

$$\lim_{n \to \infty} \Pr\left\{(X^n,Y^n) \ne \chi(\phi_x, \phi_y, \psi_x, \psi_y)\right\} = 0.$$

The encoding procedure is almost identical to the rate
distortion encoding procedure. In particular, set $\phi_x(X^n) = f_x(X^n)$
and $\phi_y(Y^n) = f_y(Y^n)$. Decompose the expected
joint distortion into the marginal and conditional distortions
$D_x, D_{y|x}$, which must satisfy $D_x + D_{y|x} \le D + \epsilon$ by definition.

Define the remaining encoding functions $\psi_x, \psi_y$ as follows:
Bin the $X^n$ sequences randomly into $2^{n(D_x+3\epsilon)}$ bins and, upon
observing the source sequence $X^n$, set $\psi_x(X^n) = b_x(X^n)$
(where $b_x(X^n)$ is the bin index of $X^n$). Similarly, bin the $Y^n$
sequences randomly into $2^{n(D_{y|x}+3\epsilon)}$ bins and, upon observing
the source sequence $Y^n$, set $\psi_y(Y^n) = b_y(Y^n)$ (where
$b_y(Y^n)$ is the bin index of $Y^n$).

The decoder finds the unique $\hat{X}^n$ in bin $b_x(X^n)$ satisfying
$d_x(\hat{X}^n, Z^n) \le D_x + \epsilon$. If $\hat{X}^n \ne X^n$, an error occurs. Upon
successfully recovering $\hat{X}^n = X^n$, the decoder finds the
unique $\hat{Y}^n$ in bin $b_y(Y^n)$ such that $d_{y|x}(\hat{Y}^n, Z^n) \le D_{y|x} + \epsilon$.
If $\hat{Y}^n \ne Y^n$, an error occurs.

The various sources of error are the following: 

1) An error occurs if $d_x(X^n, g(\phi_x, \phi_y)) > D_x + \epsilon$ or
$d_{y|x}(Y^n, g(\phi_x, \phi_y)) > D_{y|x} + \epsilon$. By Lemma 2, this type
of error occurs with probability at most $\epsilon$.

2) An error occurs if there is some other $\hat{X}^n \ne X^n$
in bin $b_x(X^n)$ satisfying $d_x(\hat{X}^n, g(\phi_x, \phi_y)) \le D_x + \epsilon$.
By Lemma 3 and the observation that
$\Pr\{\hat{x}^n \in \text{bin } b_x(X^n)\} = 2^{-n(D_x+3\epsilon)}$, this type of
error occurs with arbitrarily small probability.

3) An error occurs if there is some other $\hat{Y}^n \ne Y^n$
in bin $b_y(Y^n)$ satisfying $d_{y|x}(\hat{Y}^n, g(\phi_x, \phi_y)) \le D_{y|x} + \epsilon$.
By Lemma 3 and the observation that
$\Pr\{\hat{y}^n \in \text{bin } b_y(Y^n)\} = 2^{-n(D_{y|x}+3\epsilon)}$, this type of
error is also small.
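The second error event can be bounded with a union bound. The following worked step is a sketch that combines the bin-collision probability stated above with the counting bound on $A_x(z^n)$ (the marginal version given in Lemma 4, which follows the argument of Lemma 3); the third event is handled in the same way with $A_{y|x}(z^n)$ and $D_{y|x}$.

```latex
\begin{align*}
\Pr\{\text{error of type 2}\}
 &\le \sum_{\hat{x}^n \in A_x(z^n),\ \hat{x}^n \ne x^n}
      \Pr\{\hat{x}^n \in \text{bin } b_x(x^n)\} \\
 &\le |A_x(z^n)| \cdot 2^{-n(D_x + 3\epsilon)} \\
 &\le 2^{n(D_x + 2\epsilon)} \cdot 2^{-n(D_x + 3\epsilon)}
  = 2^{-n\epsilon} \longrightarrow 0 \quad \text{as } n \to \infty.
\end{align*}
```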
At this point the proof is essentially complete, but there
is a minor technical difficulty dealing with the sequences
$\{\delta_1^{(n)}, \delta_2^{(n)}\}_{n=1}^{\infty}$ corresponding to the sequences of marginal
and conditional distortions computed from $Z^n$ for each $n$.
We require that there exists some $\delta_1$ such that $\delta_1^{(n)} \to \delta_1$,
and similarly for the sequence of $\delta_2^{(n)}$'s. However, since
$[0, D + \epsilon] \times [0, D + \epsilon]$ is compact, we can find a convergent
subsequence so that the desired limits exist. ■



Claim 2: If $(R_x, R_y, \delta_1, \delta_2)$ is an achievable rate vector for
the SMSW network, then $(R_x, R_y, \delta_1 + \delta_2)$ is an achievable
rate distortion vector for the JD network.

Proof: By Theorem 4, we must have:

$$R_x \ge H(X|Y) - \delta_1$$
$$R_y \ge H(Y|X) - \delta_2$$
$$R_x + R_y \ge H(X,Y) - \delta_1 - \delta_2.$$

Let $D = \delta_1 + \delta_2$. For fixed $\delta_1, \delta_2$, any nontrivial $(R_x, R_y)$ pair
in this region can be achieved by an appropriate time-sharing
scheme between the two points

$$P_1 = \big(\max\{H(X|Y) - D,\, 0\},\ \min\{H(Y),\, H(Y) - (D - H(X|Y))\}\big), \text{ and}$$
$$P_2 = \big(\min\{H(X),\, H(X) - (D - H(Y|X))\},\ \max\{H(Y|X) - D,\, 0\}\big).$$

By the results given in Examples 1 and 2, point $P_1$ allows
$X, Y$ to be recovered with distortion $D$. Symmetrically, point
$P_2$ allows $X, Y$ to be recovered with distortion $D$. Thus, using
the appropriate time-sharing scheme to generate average rates
$(R_x, R_y)$, we can create a sequence of rate distortion codes
that achieve the point $(R_x, R_y, D)$ for the JD network. ■
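The two corner points used in the proof can be computed directly from the source statistics. The sketch below is illustrative only: the function names and the example pmf are hypothetical, and the check assumes $0 \le D \le H(X,Y)$. It confirms that both points have sum rate $H(X,Y) - D$, i.e., that they lie on the sum-rate boundary of the region in Theorem 1.

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def corner_points(p, D):
    """Return (P1, P2, H(X,Y)) for a joint pmf p[x][y] and distortion D."""
    hxy = H([q for row in p for q in row])
    hx = H([sum(row) for row in p])
    hy = H([sum(col) for col in zip(*p)])
    hx_given_y = hxy - hy
    hy_given_x = hxy - hx
    p1 = (max(hx_given_y - D, 0.0), min(hy, hy - (D - hx_given_y)))
    p2 = (min(hx, hx - (D - hy_given_x)), max(hy_given_x - D, 0.0))
    return p1, p2, hxy

# Hypothetical source with H(X,Y) = 1.5 bits.
p = [[0.5, 0.0], [0.25, 0.25]]
p1_small, p2_small, hxy = corner_points(p, 0.25)  # D below both conditional entropies
p1_large, p2_large, _ = corner_points(p, 1.0)     # D above both conditional entropies
```

Time-sharing between the two points traces out the line segment of rate pairs with sum rate $H(X,Y) - D$ between them.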

C. Proof of Theorem 2

The proof of Theorem 2 is similar in spirit to the proof of
Theorem 1 and has therefore been moved to the appendix. The
key difference between the proofs is that, instead of showing
a correspondence between $\mathcal{R}$ and the SMSW achievable rate
region, we show a correspondence between $\mathcal{R}_x$ and the
Ahlswede-Korner-Wyner achievable rate region.

IV. Conclusion 

In this paper, we gave the rate distortion regions for two 
different multiterminal networks subject to distortion con- 
straints using the entropy distortion measure. In the case of 
the Joint Distortion and X-Distortion networks, we observed 
that any point in the rate distortion region can be achieved 
by timesharing between points in the SMSW region and 
the Ahlswede-Korner-Wyner regions respectively. Perhaps this 
is an indication that the rate distortion region for more 
general multiterminal source networks (subject to distortion 
constraints using the entropy distortion measure) can be char- 
acterized by simpler source networks for which achievable rate 
regions are known. This is one potential direction for future 
investigation. 

Appendix 

This appendix contains a sketch of the proof for Theorem 2.

Claim 3: If $(R_x, R_y, D_x)$ is an achievable rate-distortion
vector for the XD network, then $(R_x + D_x, R_y)$ is an achievable
rate vector for the source coding with side information
problem.



Proof: Suppose we have a sequence of $(2^{nR_x}, 2^{nR_y}, n)$-rate
distortion codes satisfying

$$\lim_{n \to \infty} E\left[d_x(X^n, g(f_x(X^n), f_y(Y^n)))\right] \le D_x.$$

The basic idea is to let the X-encoder send $f_x(X^n)$ (requiring
rate $R_x$) and have the Y-encoder send $f_y(Y^n)$ (requiring
rate $R_y$). By Lemma 4, the number of $X^n$ sequences that
lie in $A_x(v^n)$ is less than $2^{n(D_x+2\epsilon)}$. Therefore, if the
X-encoder performs a random binning of the $X^n$ sequences
into $2^{n(D_x+3\epsilon)}$ bins and sends the bin index corresponding to
the observed sequence $X^n$ (incurring an additional rate of
$D_x + 3\epsilon$), the decoder can recover $X^n$ losslessly with high
probability. ■

Claim 4: If $(R_x + D_x, R_y)$ is an achievable rate
vector for the source coding with side information network,
then $(R_x, R_y, D_x)$ is an achievable rate distortion vector for
the XD network.

Proof: Since $(R_x + D_x, R_y)$ is an achievable rate vector,
there exists some conditional distribution $p(u|y)$ so that
$R_x + D_x \ge H(X|U)$ and $R_y \ge I(Y;U)$. WLOG, reduce
$R_x$ and $R_y$ if necessary so that $R_x + D_x = H(X|U)$ and
$R_y = I(Y;U)$. Now, we construct a sequence of codes
that achieve that point in the standard way. In particular,
generate $2^{n(R_y+\epsilon)}$ different $U^n$ sequences independently i.i.d.
according to $p(u)$. Upon observing $Y^n$, the Y-encoder finds
a jointly typical $U^n$ and sends the corresponding index to
the decoder. At the X-encoder, bin the $X^n$ sequences into
$2^{n(R_x+D_x+2\epsilon)}$ bins and, upon observing the source sequence
$X^n$, send the corresponding bin index to the decoder. With
high probability, the decoder can reconstruct $X^n$ losslessly.

From this sequence of codes, we can construct a sequence
of rate distortion codes that achieve the point $(R_x, R_y, D_x)$ as
follows. At the X-encoder, employ the following time-sharing
scheme:

1) Use the lossless code described above with probability
$(1 - D_x/H(X|U))$. In this case, the distortion on
$X$ can be made arbitrarily small. Note that we can
assume w.l.o.g. that $D_x \le H(X|U)$ since distortion
$D_x = H(X|U)$ can be achieved when the decoder only
receives the sequence $U^n$.

2) With probability $D_x/H(X|U)$, the X-encoder sends
nothing, while the Y-encoder continues to send $U^n$. In
this case, the distortion on $X$ is $H(X|U)$.

Averaging over the two strategies, we obtain a sequence
of rate distortion codes that achieve the rate distortion triple
$(R_x, R_y, D_x)$. ■
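The averaging in the time-sharing scheme above amounts to simple arithmetic, sketched below. The function name and the numerical values are hypothetical; the code assumes $H(X|U) > 0$ and $0 \le D_x \le H(X|U)$, and ignores the vanishing $\epsilon$ terms.

```python
def time_share(h_x_given_u, d_x):
    """Average X-encoder rate and distortion for the two-strategy time-sharing scheme.

    With probability 1 - lam the lossless code is used (rate H(X|U), distortion ~0);
    with probability lam the X-encoder is silent (rate 0, distortion H(X|U)).
    """
    if h_x_given_u <= 0 or not 0 <= d_x <= h_x_given_u:
        raise ValueError("requires H(X|U) > 0 and 0 <= D_x <= H(X|U)")
    lam = d_x / h_x_given_u
    avg_rate = (1 - lam) * h_x_given_u
    avg_distortion = lam * h_x_given_u
    return lam, avg_rate, avg_distortion

# Hypothetical values: H(X|U) = 1 bit, target distortion 0.25 bits.
lam, avg_rate, avg_distortion = time_share(1.0, 0.25)
```

The average rate is $H(X|U) - D_x = R_x$ and the average distortion is $D_x$, which is the trade of one bit of rate for one bit of distortion described after Theorem 2.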